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Step 1
Step 2
Step 3
Step 4



How Rings assign address space 

align incoming address to self (to some power of 2)
assign the result to self address

next_addr = self_addr + self_addr_space; // number of registers used locally
send down next_addr



Example:

DMA needs 16 addresses
UART needs 4
Timer needs 256

Enumerate message: Addr = 8
self = 32, Addr = 36
self = 256, Addr = 512
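
The address assignment rule can be sketched in Python. This is only a sketch under assumptions: the member order (DMA, UART, timer), the incoming address of 8, and the names `align_up` and `enumerate_ring` are illustrative readings of the figure, not taken from the source, and each member's address space is assumed to be a power of 2.

```python
def align_up(addr: int, size: int) -> int:
    """Round addr up to the next multiple of size (size must be a power of 2)."""
    if size <= 0 or size & (size - 1) != 0:
        raise ValueError("size must be a power of 2")
    return (addr + size - 1) & ~(size - 1)


def enumerate_ring(incoming_addr: int, members):
    """Assign a self address to each member in ring order.

    members is a list of (name, address_space) pairs (hypothetical format).
    Returns the assigned addresses and the address passed on by the last member.
    """
    assigned = {}
    addr = incoming_addr
    for name, space in members:
        self_addr = align_up(addr, space)  # align incoming address to own size
        assigned[name] = self_addr
        addr = self_addr + space           # next_addr sent down the ring
    return assigned, addr


# Assumed order and start address, chosen to match the values in the example.
members = [("dma", 16), ("uart", 4), ("timer", 256)]
assigned, next_addr = enumerate_ring(8, members)
# assigned == {"dma": 16, "uart": 32, "timer": 256}, next_addr == 512
```

Under these assumptions the UART lands at 32 and passes on 36, and the timer is pushed up to 256 and passes on 512, which agrees with the values shown in the example.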
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9 



[Figure: two flip-flops in series; signals d, Q, clk1 and clk2]



If the delay between "clk1" and "clk2" is
greater than the delay from Q to d of the
second flip-flop, we have a race on our hands,
meaning the right-hand flip-flop will
sample the data of Q a whole clock period
early.




[Figure: compound A and compound B; clock runs with data]

The problem is a possible race.
However, we control the logic on
each flip-flop leaving the compound.
Because it is always the same standard
ring-interface module, we can ensure
that the delay will be at least enough
and, more importantly, that it is easily checked
after layout.
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10 



[Figure: compound A and compound B; signals data_a and data_b; clock opposes data]

This arrangement has the advantage
of automatically ensuring that no race
condition exists (at least in this simple
case).




data_a, which changes after clkb,
which is later than clka, is sampled
by clka. NO RACE.



[Figure: compound A and compound B; signals clka and ok; reference numeral 90]
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Qbr 



[Figure: bridge and ring member with a latch on the data line; waveforms of clka, clkb and data_a]




data_a leaving the bridge goes to member "b" and
should be sampled there by the rising edge of clkb. clkb
lags a lot behind clka of the bridge. As is clearly seen
from the waveforms, a race is imminent. Here we
should add latches (90) for all the data lines.

Adding a latch works, however, only if the delay between
clka and clkb is less than 75% of the cycle time;
otherwise the uncertainty kills the usable time. This
sets a hard limit on the number of ring members.
Also keep in mind that latches are needed on each OK
signal between members of the ring.
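
The 75% rule can be turned into a quick budget check. The sketch below is illustrative only: it assumes that the clock lag grows by the same amount at every ring member, which the source does not state, and the function names and the nanosecond units are hypothetical.

```python
def latch_fix_works(skew_ns: float, cycle_ns: float, limit: float = 0.75) -> bool:
    """True if the clka-to-clkb delay is below the stated fraction of the cycle."""
    return skew_ns < limit * cycle_ns


def max_members(cycle_ns: float, per_member_delay_ns: float, limit: float = 0.75) -> int:
    """Largest member count whose accumulated clock lag still passes the check.

    Assumes (hypothetically) that each member adds the same clock delay.
    """
    if per_member_delay_ns <= 0:
        raise ValueError("per_member_delay_ns must be positive")
    n = 0
    while latch_fix_works((n + 1) * per_member_delay_ns, cycle_ns, limit):
        n += 1
    return n


# With a 10 ns cycle and 1 ns of lag per member, 7 members fit (7 ns < 7.5 ns).
members_allowed = max_members(10.0, 1.0)
```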
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data 




[Figure: data and clock waveforms with the uncertainty range marked]



Here, data_b leaves member "b" to be sampled by
clka in the bridge. But now clkb lags a lot behind
clka. This actually works to our advantage, if the
lag is smaller than the better part of a clock cycle. This
solution looks better because, between adjacent
members, we can take care to delay the data
beyond the danger zone of the clock delay, the OK signals
are covered automatically, and the last-leg data is also
covered. The only signal that is not safe is the OK from
the bridge to the "b" member. It will need a latch in "b".



[Figure: big module behind a ring interface; signals local_clock, local_data_out, data from previous member, clk and clock]



The local clock lags behind the
ring-interface clock of this
module, because we presume the
module is big. For data coming
out, this is not a problem: it changes
later than the ring-interface flip-flops' clock.
However, for data entering the
module from the previous member,
a race is a possibility we must
look into.
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17. 



II* 




If module "a" sends a message to module "b", the ring works
fine. However, if most of the traffic is from "c" to "b",
this is more expensive in terms of latency.



Another problem is "peak latency". Suppose that "a"
transmits mostly to "d" and "b" mostly to "c". In this case,
communication between "b" and "c" suffers degradation
when the traffic peaks coincide.
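
Both effects can be seen by listing the links a message occupies. The sketch below assumes a unidirectional ring of four members in the order a, b, c, d; the ring size, the direction and the function name `path_links` are assumptions made for illustration.

```python
RING = ["a", "b", "c", "d"]  # assumed member order on a one-way ring


def path_links(src: str, dst: str, ring=RING):
    """Return the list of (from, to) links a message crosses from src to dst."""
    n = len(ring)
    i = ring.index(src)
    links = []
    while ring[i] != dst:
        j = (i + 1) % n
        links.append((ring[i], ring[j]))
        i = j
    return links


# Latency: "a" to "b" is one hop, "c" to "b" has to go almost all the way round.
hops_a_b = len(path_links("a", "b"))
hops_c_b = len(path_links("c", "b"))

# Peak latency: traffic from "a" to "d" crosses the same link that "b" to "c" needs.
shared = set(path_links("a", "d")) & set(path_links("b", "c"))
```

In this model `hops_a_b` is 1, `hops_c_b` is 3, and `shared` contains the single link from "b" to "c", which is where the two flows compete.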
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Land bridge gets its name from the fact that it is 
a luxury. It spans across connected modules. 
The idea is simple. When V2 sends a message to
D1, it gets to one side of the bridge. This side
analyzes the destination address and by some
magic (explained later) decides to short-cut the
path. The message re-appears at the other end of
the bridge and quickly gets to D1. By the same magic,
a message from V1 to D2 is also bypassed, while
a message from V1 to D1 is handled directly.




Enumeration is started by the "Anchor",
which assigns address=1 to itself. The results
of enumeration are labels 1 to 7. The land
bridge gets two addresses, as if it were not
one module: there is the "near" end, which got
enumeration label "3", and the "far" end,
marked 6.
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When msg1 and msg2 arrive at the same time,
the bridge end must make a decision 
which message to forward first. 

It can be shown that an unwise decision can
lead to freeze-out, deadlock and the option price
dropping to $5.

Therefore msg2 gets the priority.
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umsg 




[Figure: bridge end with two FIFOs and arbitration logic; signals umsg and dmsg]
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The bridge takes responsibility for strays,
but only at the "far" end. During 
enumeration, bridge is "polarized" to 
have near and far end. Near is the end 
first struck by enumeration message. 



So we have exactly one enforcer for each 
ring. 



3 near 




11 far 



In a land bridge ring, the situation is trickier. Suppose V2
sends a message to address=5. The land bridge
diverts it at the 11/far end; it will re-appear at 3 and
start cycling forever.

We have to define an algorithm that will take
care of all cases.

Luckily there is a way.

The land bridge deals only with messages arriving
at the far end and being diverted. It marks and
monitors only those. Messages arriving at the near
end keep their markings. Messages at the far end
that go straight through are left alone.
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6 



[Figure: ring interface signals type, addr, ok, scantest, reset and clk; waveform of clk, type and ok, with type going from idle to msgA to msgB]



During the first clock, OK remains active while type
carries msgA. This means that on the next clock
member A may send a new message, and member A uses
this OK to send msgB on the next clock. msgB gets
stuck for a clock because OK goes inactive. It goes
inactive because the FIFO in member B is full. One
clock later the FIFO has a free entry, so OK returns to
1 and type returns to idle on the next clock. The return to idle
could also be a change to the next message, if there was
one.
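
The OK handshake described here can be sketched as a clock-by-clock simulation. This is a simplified model under assumptions: OK is supplied as a per-clock input sequence instead of being derived from FIFO occupancy, and the function name `simulate` is hypothetical.

```python
def simulate(messages, ok_per_clock):
    """Return what the sender drives on 'type' at each clock.

    The sender keeps driving the current message until the clock on which
    the receiver's OK is active; only then may the next message start.
    """
    pending = list(messages)
    current = None          # message currently driven on the ring
    trace = []
    for ok in ok_per_clock:
        if current is None and pending:
            current = pending.pop(0)
        trace.append(current if current is not None else "idle")
        if ok and current is not None:
            current = None  # accepted; the next clock may carry a new message
    return trace


# OK is active, drops for one clock (FIFO full), then returns to 1.
trace = simulate(["msgA", "msgB"], [1, 0, 1, 1])
# trace == ["msgA", "msgB", "msgB", "idle"]
```

msgB appears twice in the trace because it is held for the one clock during which OK is inactive.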



[Figure: member ports imsg/iok, omsg/ook, umsg/uok and dok]
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19 



0 



The incoming messages are examined first
to see whether each is a supervisor or a work/program message.
Work/program messages have an address field.
We check whether it is our address. Since we know
that our address is aligned to our power of 2,
the address mask (named the split mask)
causes only a certain number of upper bits to
be compared. The lower part of the address
is passed inside as the internal address. The
upper bits are compared against the self-address
register. This register gets its value during the
enumeration protocol. The lower part of this
register is always masked. Hopefully
synthesis will delete the unused bits'
implementation.

[Figure: comparator with inputs incoming address, split mask and self address register (low bits of the self address are don't care); outputs ours/through and the part of the address that enters the member]
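
The split-mask comparison can be sketched as follows. This is a sketch under assumptions: the 24-bit address width is borrowed from the address_out[23:0] signals elsewhere in these sheets, and the function name `match_address` and its parameters are hypothetical.

```python
def match_address(incoming: int, self_addr: int, space_bits: int, width: int = 24):
    """Compare only the upper bits of an incoming address with the self address.

    space_bits is log2 of the member's address space; the low space_bits bits
    are masked out of the comparison and passed inside as the internal address.
    Returns (ours, internal_address).
    """
    full = (1 << width) - 1
    low_mask = (1 << space_bits) - 1
    split_mask = full & ~low_mask            # ones in the compared upper bits
    ours = (incoming & split_mask) == (self_addr & split_mask)
    internal = incoming & low_mask           # part of the address that enters the member
    return ours, internal


# A member with 256 addresses (8 low bits) enumerated at self address 0x100.
hit = match_address(0x12A, 0x100, 8)   # upper bits match, internal address 0x2A
miss = match_address(0x22A, 0x100, 8)  # upper bits differ, message goes through
```

Because the self address is aligned to the member's own size, its low bits are always zero, which is why they can be treated as don't care in hardware.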
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Member 



[Figure: Member block with ports Rif_i_*, Rif_* and Rif_o_* and input module_id; it contains the RIF, Address Space = 7 and the Activation register]
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Member 



[Figure: Member block with ports Rif_i_*, Rif_* and Rif_o_* and input module_id; it contains the RIF, Address Space = 7 and the Activation register]
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the second land bridge solves most traffic problems, but 
adds 4 clocks to the overall ring length. This is not a big
problem because no message should travel the whole
perimeter.







The Utopia interface is
forced into a mode that
communicates in
messages, not cells. We are
using the I/O and maybe
some of the logic.



[Figure: Network Processor block diagram]

Application Specific Accelerators: CRC, Encryption, Table Lookup, Hashing
Internal Memory: Fast, Unified, Multi-port
Peripheral Expansion: Enet, ATM, Uart, USB, Serials
System Expansion Area
CPU (PP), DMA, Smart FIFO, Ext. mem I/F
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Internal Memory 



[Figure: Vobla Core with Internal Memory, Debug, DMA Agent, Agent Bus and Vobla Compounds]
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Current Task | Task_X



[Figure: register file before a task switch: Next Task = Task_Y, To Memory, Tag, Active (Active Reg. File), Shadow 1]

After a task switch

[Figure: register file after a task switch: Current Task = Task_Y, Next Task = Task_Z, To Memory, Source of Task X, Destination, Preload regs of Task Y, Preload regs of Task Z, Tag, Active (Active Reg. File), hit/miss of Source 1, Source Operand of Task Y]
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Other date 




stream data out

General purpose registers: 32 registers of 32 bits per task
Doorbells: a set of indications per task, which control task execution scheduling
Agent interface: an interface to adjacent resources
Internal memory: fast memory accessed by load/store instructions
Special purpose registers: per task and global registers
Configuration registers: initialized by the PP
External memory: big area accessed via a DMA interface
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R1 register: 



[Figure: R1 register bit layout]

s - sticky bit

eq - equal/zero

lt - less than/negative

gt - greater than/positive

c - carry

mb - reflection of the RAM multi-reader busy indication.





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Frame structure 
of an example 
task type 



[Figure: frame layout, top to bottom]

common task data
task fragment 1 data
task fragment 2 data
task fragment 3 data
data of all level 2 functions
level 1 f1 data
level 1 f2 data
level 1 f3 data

The size of the level 0 frame part is different for each task type.
The size of the level 1 frame part is different for each task type.
The size of the level 2 frame part is constant.
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_interface_addrcss[ 15.0 
_interface_data[31:0]

[Figure: multi-reader agent; outputs type_out[7:0] (size to CRC), address_out[23:0], data_out[63:0] (also to CRC) and ok2drive]

agent command (AID=multi_reader): opcode | options[5:0] | RA | AID[5:0] | RB/imm8

vobla entry in multireader: option bits increment, first, noop, last (op3 to op0) | source address[23:0] | destination address[23:0] | count[7:0]
  source address: source addr in sram
  destination address: address of destination
  count: number of bytes to transfer
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57 



[Figure: message sender agent: Vobla agent interface, shadow entry and output message encoder; signals type_out[7:0], address_out[23:0], data_out[63:0], reset, clk and ok2drive]

agent command (AID=message_sender, option[6]=0): options[9:0] | RA | AID[4:0]
message_sender entry: {0, options[5:0]} | raw_data[31:0] | raw_address[23:0] | type[7:0]
  raw_data: message data
  raw_address: address_of_destination
  type: message type

agent command (AID=message_sender, option[6]=1): options[9:0] | RA | AID[4:0]
message_sender entry: {1, options[5:0]} | raw_data[63:0] | raw_address[23:0] | 10000000
  raw_data: message data
  raw_address: address_of_destination
  10000000: message type



[Figure: DMA agent: Vobla interface, request entries (x2), DMA context table and input message decoder; signals _id[5:0], type_in[7:0], interface address[23:0], interface data[31:0], base address[7:0], type_out[7:0], address_out[23:0] and data_out[63:0]]
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entry 




agent command (AID=dma agent): option bits including request, urgent and set ack | sram address | {count[7:0]}
  count: number of bytes to transfer or address modifier
  flags: last_in_transfer, calculate_crc

[Figure: register file and input buffer; label "on demand"]



agent command (AID=CRC agent): opcode | options[9:0] | RA | AID[4:0]
entry: type[2:0] | size[2:0] | DATA[63:32] | DATA[31:0] | RESIDUE
  options: CRC type, data size, generate/check mode, operation, overwrite/read value
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4* 



[Figure: timer agent: Vobla agent interface, control register, prescaler, divide by 2, counter, reset and time stamp register]

agent command (AID=timer agent): opcode
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agent command (AID=doorbell agent): opcode | options[9:0] | RA | AID[4:0] | RB/imm8
  option bits: set/clear global (OP2), clear request (OP1), clear mask (OP0)

{0,0,0,0,0,mask_bit_index[2:0]}   write mask
or
{0,0,0,0,1,req_bit_index[2:0]}    write request
or
{1,0,0,0,0,count_value[2:0]}      write counter
or
{0,1,0,0,0,0,0,0}                 write TGMR
or
{1,1,0,0,0,0,0,urgent_value}      write urgent
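
The five encodings can be decoded with a few bit tests. The sketch below makes several assumptions that the sheet does not confirm: that each encoding is an 8-bit value with the first listed element as the most significant bit, that urgent_value is a single bit, and that the function name `decode_doorbell` is a suitable stand-in.

```python
def decode_doorbell(imm8: int):
    """Decode an 8-bit doorbell write value into (operation, argument)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in 8 bits")
    top5 = imm8 >> 3        # the five leading bits select the operation
    low3 = imm8 & 0b111     # 3-bit index or count field
    if top5 == 0b00000:
        return ("write mask", low3)
    if top5 == 0b00001:
        return ("write request", low3)
    if top5 == 0b10000:
        return ("write counter", low3)
    if imm8 == 0b01000000:
        return ("write TGMR", None)
    if imm8 >> 1 == 0b1100000:
        return ("write urgent", imm8 & 1)  # assumed 1-bit urgent value
    raise ValueError("unknown doorbell encoding")


examples = [decode_doorbell(v) for v in (0b00000101, 0b00001010, 0b10000111, 0b01000000, 0b11000001)]
```

Any value outside the five listed patterns raises an error in this sketch; how the hardware treats such values is not stated in the source.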
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'rings input 




APP ID= 10064340 






[Figure: Trajan chip and FPGA. Labels: Memory, Memory port, CS, Addr, DATA, FIFO_RD, FIFO_WR, WR FIFO, RD FIFO, FPGA, Encryption, DSP, PCI, Ring interface, Host #1 (PP), Host #2 (DMA), Host #3 (Ring extender), Message sender, Write request generator, Read request generator, WR FIFO used words counter, RD FIFO used words counter, Memory controller]
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interface 
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input 



[Figure: FPGA between Trajan and an external device. Labels: Trajan FIFO_WR input, Host address generator, Target address generator, RAM / FIFO, Interface to external device (device specific), three Synchronizers, Interrupt]







[Figure: receive path: ring in, mac, fifo and rx manager, with setup and data connections]

setup registers:
doorbell, taskid and viscode
urgent viscode
header len and address
threshold to urgent
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rx status word fields: free, crc, ovrn, err, last, size






enet tx

[Figure: transmit path: ring, fifo ram, tx manager, mac and ring control, with setup connection]

setup registers:
doorbell, taskid and viscode
urgent viscode
send ahead address
threshold to transmit
threshold to urgent

doorbell
free entry count
finished frames count






Control Plane

Signaling Protocols
Protocol Management
Exception Handling
System Control



Data Plane

Per-packet handling
Forwarding Decision
Classification
QoS Handling
Queuing
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